Title: ANCHOR LIBRARIES AND IDENTIFICATION OF PEPTIDE BINDING
Abstract

An anchor library is described. A collection of recombinant vectors having a nucleic acid encoding a displayed peptide sequence is
provided. The displayed peptide sequence of each of the recombinant vectors comprises:

$$X^1(Y^1)_{c^1}X^2(Y^2)_{c^2}X^3(Y^3)_{c^3}X^4$$

wherein each X¹, X², X³ and X⁴ is an amino acid residue and any of X¹, X², X³ and X⁴ can be the same or different from any one other,
wherein each Y¹, Y² and Y³ is alanine or glycine or a combination of alanine and glycine that is, respectively, c¹, c² and c³ amino acid
residues long and any of Y¹, Y² and Y³ if present can be the same or different from any one other, wherein each of c¹, c² and c³ is 0 to
about 20, wherein X¹ and X⁴ are each attached to an amino acid residue that flanks the displayed peptide sequence. Preferably, at least
about 10' to about 10* permutations of all possible permutations of the displayed peptide sequence are present in the library.
Preferably, the library does not contain more than about 10% of displayed peptide sequences different from the first mentioned
displayed peptide sequences. Also described are methods of making anchor libraries and methods of using anchor libraries to identify
a peptide sequence that binds to a target. Recombinant vectors, filamentous phage, nucleic acid molecules and proteins are also provided.

ANCHOR LIBRARIES AND IDENTIFICATION OF

Field of the Invention

This invention relates to anchor libraries and to methods of using anchor libraries to
identify peptide sequences that bind to a target molecule.

Background of the Invention

The identification of peptides which bind to target molecules that are involved in various
physiological functions can have significant implications for the diagnosis and/or treatment of
various abnormal or diseased conditions. For example, a binding peptide might modulate the
original activity of the target molecule and therefore be useful as a drug.

The use of standard libraries to identify peptide sequences which specifically bind to target
molecules is generally limited to pre-existing natural sequences from the organism which is the
source of the DNA. More recently, libraries have been described which have clones containing
short synthetic random coding sequences. See, e.g., Scott and Smith, Science 249:386-390
(1990); Cwirla et al., Proc. Natl. Acad. Sci. USA 87:6378-6382 (1990); Devlin et al., Science
249:404-406 (1990). These libraries are mixtures of filamentous phage clones, each displaying a
random peptide sequence on the virion surface. In these types of libraries, the random amino
acids are contiguous. The size of the peptides that can be screened for binding peptides in such
contiguous random amino acid libraries is limited: as the size of the peptides increases, at
some point it is no longer feasible to search such a library adequately, since too many clones are
required to cover all possible permutations of the random amino acids in the peptides.
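The coverage problem can be made concrete with a small Python sketch. It assumes that every position is drawn uniformly from the 20 common amino acids and that clones are sampled independently, so the number of clones needed to contain any one given sequence with a chosen probability follows from a simple sampling argument. The function name and the 95% confidence level are illustrative assumptions, not figures from this specification.

```python
import math

def clones_needed(peptide_length, confidence=0.95, alphabet=20):
    """Clones required so that a given sequence is present with `confidence`.

    Assumes uniform, independent sampling from alphabet**peptide_length
    equally likely sequences (an idealization of a real library).
    """
    possible = alphabet ** peptide_length
    # P(sequence absent after N clones) = (1 - 1/possible)**N <= 1 - confidence
    return math.ceil(math.log(1.0 - confidence) / math.log1p(-1.0 / possible))

for n in (4, 6, 8, 16):
    print(f"{n:>2}-mer: {20 ** n:.2e} sequences, ~{clones_needed(n):.2e} clones")
```

Under these assumptions the required clone count grows by a factor of 20 for every added contiguous random position, which is why long contiguous libraries quickly become unsearchable.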

Summary of the Invention

It is an object of the invention to identify peptide sequences that bind to specific target
molecules.

It is another object of the invention to identify amino acid residues in a peptide that are
important contacts between the peptide and a target molecule.

It is another object of the invention to determine where amino acid residues in a peptide
that are important contacts between the peptide and a target molecule are best positioned within
the peptide.

It is another object of the invention to use an anchor library in which the random amino
acid residues of the library are not continuous, for identifying amino acid residues in a peptide
that are important contacts between the peptide and a target molecule.

It is another object of the invention to use an anchor library in which the random amino
acid residues of the library are distributed throughout a much larger peptide domain consisting of
random glycine and/or alanine residues, for identifying amino acid residues in a peptide that are
important contacts between the peptide and a target molecule.

It is another object of the invention to search large peptide phage display libraries of, e.g.,
16 mers, for a reduced number of essential amino acid residue contacts, e.g., four, between the
peptide and a target molecule.

It is another object of the invention to identify a consensus sequence of a defined number
of amino acid residues in any configuration of spacer amino acids, that are important contacts
between a peptide and a target molecule.

It is yet another object of the invention to use a known core binding sequence on a peptide
which binds to a target molecule, and identify surrounding amino acid residues which are
additional important contacts between the peptide and the target molecule.

Still another object of the invention is to identify cysteine residues on a peptide which can
form disulfide bridges and thereby increase the binding affinity of the peptide with a target
molecule.

According to the invention, an anchor library is provided. The anchor library comprises a
collection of recombinant vectors, e.g., viruses, phage, e.g., filamentous phage, plasmids or
cosmids. Each of the vectors has a nucleic acid sequence inserted in a gene, e.g., a coat protein
gene, e.g., gene III or gene VIII, thioredoxin, staphnuclease, lac repressor, gal4 or an antibody.
The nucleic acid sequence encodes a displayed peptide sequence, e.g., displayed on the surface
of a virion, cell, spore or gene product, which comprises:

$$X^1(Y^1)_{c^1}X^2(Y^2)_{c^2}X^3(Y^3)_{c^3}X^4$$

wherein each X¹, X², X³ and X⁴ is an amino acid residue and any of X¹, X², X³ and X⁴ can be the
same or different from any one other, wherein each Y¹, Y² and Y³ is alanine or glycine or a
combination of alanine and glycine that is, respectively, c¹, c² and c³ amino acid residues long and
any of Y¹, Y² and Y³ if present can be the same or different from any one other, wherein each of
c¹, c² and c³ preferably is 0 to about 20, more preferably is 0 to about 10, even more preferably is
0 to about 6, or most preferably is 0 to about 4, wherein X¹ and X⁴ are each attached to an amino
acid residue that flanks the displayed peptide sequence. In certain embodiments, at least about
1 to about 10' permutations of all possible permutations of the displayed peptide sequence are
present in the anchor library. In other embodiments, the library does not contain more than about
10%, or more than about 1%, or more than about 0.1%, of displayed peptide sequences different
from the first mentioned displayed peptide sequences.

Another aspect of the invention is where each Y¹, Y² and Y³ is any specified amino acid or
combination of specified amino acids, e.g., alanine or cysteine or a combination of alanine and
cysteine; or glycine or cysteine or a combination of glycine and cysteine.

In certain embodiments, the displayed peptide sequence further has at least one core
binding sequence which is preferably about 1 to about 20 amino acid residues in length, more
preferably about 4 to about 10, and most preferably is 6. The core binding sequence can be in
addition to, or a replacement for, other amino acids in the displayed peptide sequence.
Variations include the presence of more than one core binding sequence in the displayed peptide
sequence, where, e.g., the core binding sequences can be adjacent, or not adjacent, to each other,
and where they can be, e.g., identical or not identical to each other.

In other embodiments, the displayed peptide sequence further has at least one constraint,
e.g., a crosslink, e.g., a disulfide bond, e.g., from the presence of a cysteine residue; a stacking
interaction; a positive or negative charge; hydrophobicity; hydrophilicity; a structural motif, e.g.,
a zinc finger formation, a leucine zipper, or a β-turn structure, e.g., from the presence of the
sequence asp gly or pro gly; or combinations thereof. Cysteine residues can be in addition to, or
a replacement for, other amino acids in the displayed peptide sequence.

Another aspect of the invention is a method of making an anchor library. A collection of
nucleic acid sequences is synthesized. The nucleic acid sequences are inserted into vectors to
give recombinant vectors and the recombinant vectors are introduced into a host. The host
having the recombinant vectors is propagated so as to result in a collection of recombinant
vectors, each of which has a nucleic acid sequence from the collection of nucleic acid sequences
which encodes a displayed peptide sequence comprising:

$$X^1(Y^1)_{c^1}X^2(Y^2)_{c^2}X^3(Y^3)_{c^3}X^4$$

Another aspect of the invention is a method of using an anchor library to identify a peptide
sequence that binds to a target. An anchor library having a collection of recombinant vectors is
provided. Each of the recombinant vectors has a nucleic acid sequence which encodes a
displayed peptide sequence comprising:

$$X^1(Y^1)_{c^1}X^2(Y^2)_{c^2}X^3(Y^3)_{c^3}X^4$$

Expression and display of the peptide sequence is permitted. The anchor library is contacted
with the target, e.g., proteinaceous or non-proteinaceous molecules, e.g., ligands, receptors,
hormones, cytokines, antibodies, antigens, enzymes, enzyme substrates or viruses, under
conditions in which the displayed peptide sequence binds to the target, and the displayed peptide
sequence which binds to the target is identified, e.g., by sequencing the nucleic acid sequence on
the recombinant vector which encodes the displayed peptide sequence. Preferably, the
identified displayed peptide sequence is synthesized.

The invention also provides for a peptide which is identified by use of an anchor library, in
which the peptide is useful as a diagnostic or therapeutic product in that the peptide is able to
bind to a target molecule which is involved in a physiological process.

Other aspects of the invention include, e.g., a collection of recombinant DNA molecules
encoding peptide sequences having a plurality of different binding domains; a recombinant
filamentous phage having a displayed peptide sequence with known binding properties and
which is foreign to the filamentous phage; a recombinant vector having a nucleic acid sequence
inserted in a gene, the nucleic acid sequence encoding a displayed peptide sequence having
known binding properties; a recombinant nucleic acid molecule having a nucleic acid sequence
inserted in a gene, the nucleic acid sequence encoding a displayed peptide sequence having
known binding properties; and a recombinant protein having a displayed peptide sequence
having known binding properties.

The above and other objects, features and advantages of the present invention will be
better understood from the following specification.

Detailed Description

This invention provides an anchor library. The anchor library comprises a collection of
recombinant vectors, each of which has a nucleic acid sequence inserted in a gene. The nucleic
acid sequence encodes a displayed peptide sequence which comprises:

$$X^1(Y^1)_{c^1}X^2(Y^2)_{c^2}X^3(Y^3)_{c^3}X^4$$

wherein each X¹, X², X³ and X⁴ is an amino acid residue and any of X¹, X², X³ and X⁴ can be the
same or different from any one other, wherein each Y¹, Y² and Y³ is alanine or glycine or a
combination of alanine and glycine that is, respectively, c¹, c² and c³ amino acid residues long
and any of Y¹, Y² and Y³ if present can be the same or different from any one other, wherein each
of c¹, c² and c³ is 0 to about 20, wherein X¹ and X⁴ are each attached to an amino acid residue that
flanks the displayed peptide sequence. In certain embodiments at least about 10^ to about 10^
permutations of all possible permutations of the displayed peptide sequence are present in the
anchor library. In other embodiments, the library does not contain more than about 10%, or
more than about 1%, or more than about 0.1% of displayed peptide sequences different from the
first mentioned displayed peptide sequences.

By anchor library is meant a library in which the recombinant vectors have nucleic acid
sequences which code for peptide sequences with random amino acids in which the random
amino acids are not continuous. An anchor library is thus distinguishable from other random
amino acid libraries in which all random amino acids in the peptide sequence of interest are
contiguous. In anchor libraries, a given number of random amino acids are distributed
throughout a larger peptide domain consisting of specifically designated amino acid residues.
Anchor libraries are meant to include, e.g., external libraries, e.g., phage display libraries, and
internal libraries, e.g., plasmid libraries. Chemical libraries can be anchor libraries.

Vectors are meant to include, e.g., phage, viruses, plasmids, cosmids, or any other suitable
vector known to those skilled in the art. The vector has a gene, native or foreign, which is able to
tolerate insertion of a foreign peptide into the gene product of the gene. By gene is meant an
intact gene or fragment thereof. In the invention, the expressed gene product contains the
inserted peptide.

For certain embodiments of this invention, e.g., where phage display libraries are
employed, the preferred vectors are filamentous phage, though other vectors can be used.
Filamentous phage are single stranded DNA phage having coat proteins. Preferably, the gene that
the nucleic acid sequence is inserted into is a coat protein gene of the filamentous phage.
Preferred coat proteins are gene III or gene VIII coat proteins. Insertion of a foreign peptide into
a coat protein gene results in the display of the foreign peptide on the surface of the phage.
Insertion into any other gene product in which the inserted peptide is displayed can also be used
in this invention. Examples of filamentous phage vectors which can be used in this invention are
fUSE vectors, e.g., fUSE1, fUSE2, fUSE3 and fUSE5, in which the insertion is just downstream
of the pIII signal peptide. Smith and Scott, Methods in Enzymology 217:228-257 (1993).

In other embodiments, e.g., where internal libraries are employed, the preferred vectors are
plasmids, though other vectors can be used. The gene that the nucleic acid is inserted into is a
gene which also results in display of the inserted peptide sequence. The gene can encode an
exported or non-exported gene product. Preferred genes include, e.g., thioredoxin,
staphnuclease, lac repressor, gal4 or an antibody.

By recombinant vector is meant a vector having a nucleic acid sequence which is not
normally present in the vector. The nucleic acid sequence is inserted into a gene present on the
vector. Insertion of a nucleic acid into a gene is meant to include insertion within the gene or
immediately 5' or 3' to, respectively, the beginning or end of the gene, such that when expressed,
a fusion gene product is made. The nucleic acid sequence that is inserted includes, e.g., a
synthesized nucleic acid sequence or a fragment of another nucleic acid molecule. The nucleic
acid sequence encodes a displayed peptide sequence.

By displayed peptide sequence is meant a peptide sequence that is on the surface of, e.g., a
virion, e.g., a phage or virus, a cell, a spore, or an expressed gene product. It is preferable to have
the displayed peptide displayed such that it is able to bind to added target molecules. A
displayed peptide sequence can be identical to, or not identical to, a naturally occurring peptide
sequence.

The displayed peptide sequence can vary in size. As the size increases, the complexity of
the anchor library increases, such that at some point a complete library is not obtainable.
Complete libraries or incomplete libraries can be used in this invention. In certain embodiments,
the complexity of the anchor library is at least about 10» to about 10". Preferably, the complexity
is at least about 10*. It is preferred that the total size of the displayed peptide sequence (the
random amino acids plus the spacer amino acids) should not be greater than about 100 amino
acids long, more preferably not greater than about 50 amino acids long, and most preferably not
greater than about 25 amino acids long. A particularly preferred library is made up of displayed
peptides in which the longest of the peptides is 16 amino acids, i.e., a 16 mer library.

In large standard libraries, e.g., of 16 mers or greater, it is ordinarily not possible to search
a library which contains all possible combinations of the 16 random amino acids. A major
advantage of the anchor libraries of this invention is that these large libraries can be searched by
looking for a reduced number of essential amino acid contacts between the peptides and the
target. Preferably, the number of essential amino acid contacts should be sufficient to achieve
micromolar binding. Preferably, the reduced number of essential contacts is about ...
ten, and most preferably it is about four. See Example 4. Thus, e.g., the number of combinations
of four amino acid residue contacts in a 16 mer library is much less than the total number of
combinations of all 16 amino acids in a 16 mer library, and therefore this invention makes it
possible to determine four important contact amino acids in a peptide of 16 amino acids in
length, as opposed to standard screening of standard libraries in which such determinations
cannot ordinarily be made.
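As a rough worked comparison, assuming each random position is drawn from the 20 common amino acids and setting aside the spacer permutations, four random contact positions versus sixteen fully random positions give:

$$20^{4} = 1.6 \times 10^{5} \qquad \text{versus} \qquad 20^{16} \approx 6.55 \times 10^{20}$$

so the fully random 16 mer space is larger by a factor of $20^{12} \approx 4.1 \times 10^{15}$.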

In one embodiment of the invention, the displayed peptide sequence comprises

$$X^1(Y^1)_{c^1}X^2(Y^2)_{c^2}X^3(Y^3)_{c^3}X^4$$

X¹, X², X³ and X⁴ are amino acid residues, each of which can be the same or different from
any one of the others. Preferably, the amino acids are chosen from the 20 amino acids
commonly found in naturally occurring proteins.

Y¹, Y² and Y³ can be any specified amino acid residue or combination of specified amino
acid residues, and each of the Ys, if present, can be the same or different from any one of the
others. Preferably, the amino acids are spacer amino acids which will not significantly interfere
with the binding between the peptide sequence and a target molecule. It is preferable to use
combinations of two or more amino acids for the Y amino acids in a given library so as to reduce
any limitations in the conformations of the displayed peptide that might be imposed by use of
only one given amino acid. Most preferably, glycine and alanine residues are used in
combination in the library. Glycine and alanine are small side chain amino acids that appear to
act more as blanks than interfering contacts. In other embodiments, the Y amino acids can be
amino acids which are chosen because they do significantly affect in some way the binding
between the peptide sequence and a target molecule. For example, glycine and cysteine residues
can be used in combination, or alanine and cysteine residues can be used in combination.

Y¹, Y² and Y³ are, respectively, c¹, c² and c³ amino acid residues long. c¹, c² and c³ can be
the same or different from any one of the others. Preferably, each of c¹, c² and c³ is 0 to about 20,
more preferably is 0 to about 10, even more preferably is 0 to about 6, and most preferably is 0 to
about 4.

For example, in an anchor library where each of c¹, c² and c³ is 0 to 4, and the Ys are a
combination of glycine and alanine, the minimal structure of the peptide sequence is 4 amino
acids long (where each of c¹, c² and c³ is 0):

$$X^1X^2X^3X^4$$

and the maximal structure of the peptide sequence is 16 amino acids long (where each of c¹, c²
and c³ is 4):

$$X^1(G/A)(G/A)(G/A)(G/A)X^2(G/A)(G/A)(G/A)(G/A)X^3(G/A)(G/A)(G/A)(G/A)X^4$$

where (G/A) is a glycine or alanine residue. This anchor library also contains all other in-between
permutations of c, e.g., where c¹ is 0, c² is 1 and c³ is 1; where c¹ is 1, c² is 1 and c³ is 1;
where c¹ is 2, c² is 1 and c³ is 1; etc. All possible permutations of alanine and glycine for each of
the designated c values are also included in this anchor library.
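The example library above can be enumerated directly. The following Python sketch lists every spacer arrangement for c values of 0 to 4 with glycine/alanine spacers, marking each random position with a placeholder `X`. The function names are illustrative, and the final count assumes that each X position is drawn from the 20 common amino acids.

```python
from itertools import product

SPACER_RESIDUES = "GA"  # glycine / alanine spacers, as in the example above
MAX_SPACER_LEN = 4      # each c value ranges from 0 to 4

def spacers(max_len=MAX_SPACER_LEN):
    """All G/A spacer strings of length 0..max_len."""
    return ["".join(p)
            for n in range(max_len + 1)
            for p in product(SPACER_RESIDUES, repeat=n)]

def templates(max_len=MAX_SPACER_LEN):
    """Yield every spacer arrangement; 'X' marks a random position."""
    options = spacers(max_len)
    for y1, y2, y3 in product(options, repeat=3):
        yield f"X{y1}X{y2}X{y3}X"

all_templates = list(templates())
lengths = [len(t) for t in all_templates]
print(len(all_templates), min(lengths), max(lengths))

# Assumption: 20 possible residues at each of the four X positions.
total_sequences = len(all_templates) * 20 ** 4
print(f"{total_sequences:.3e} distinct displayed sequences")
```

With these assumptions each spacer has 31 possible forms (1 + 2 + 4 + 8 + 16), giving 31³ arrangements that range from 4 to 16 residues in length.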

It is preferred that all possible permutations of the displayed sequence are present, that is,
all combinations of c values and all combinations of, e.g., alanine and/or glycine, for each of the
c values. In other embodiments, at least about 10* to about 10* permutations of all possible
permutations are present in the anchor library, or at least about 10* permutations of all possible
permutations are present in the anchor library, or at least about 10* permutations of all possible
permutations are present in the anchor library, or at least about 10* permutations of all possible
permutations are present in the anchor library, or at least about 10' permutations of all possible
permutations are present in the anchor library, or at least about 10' permutations of all possible
permutations are present in the anchor library, or at least about 10* permutations of all possible
permutations are present in the anchor library.

In certain embodiments, the library does not contain more than about 10% of displayed
peptide sequences different from the first mentioned displayed peptide sequences. In other
embodiments, the library does not contain more than about 1% of displayed peptide sequences
different from the first mentioned displayed peptide sequences. And in yet other embodiments,
the library does not contain more than about 0.1% of displayed peptide sequences different from
the first mentioned displayed peptide sequences.

In certain embodiments of the invention, the displayed peptide can have additional units of
$X(Y)_c$. For example, it can have preferably about 1 to about 10 additional units, more preferably
about 1 to about 5 additional units, and most preferably about 1 to about 3 additional units. In
other embodiments, one or more additional units of X alone or $(Y)_c$ alone can be present.

In yet other embodiments of the invention, the anchor libraries described above can have at
least one core binding sequence, denoted by B, of p amino acid residues in length. B can be any
size, e.g., from a single amino acid to the size of a gene. Preferably, p is about 1 to about 20,
more preferably p is about 4 to about 10, and most preferably p is about 6. By core binding
sequence is meant a peptide sequence which is known to bind to a target molecule. In certain
embodiments, the core binding sequence is additional to the amino acid residues of the displayed
peptide sequences described above. In such libraries, the core binding sequence can be
positioned on the NH₂-terminal or COOH-terminal side of any of the X¹, X², X³ or X⁴ amino acid
residues, or on the NH₂-terminal or COOH-terminal side of any of the Y, e.g., alanine or glycine,
residues. In other embodiments, at least one of the X residues is replaced with the core binding
sequence. In yet other embodiments, at least one of the Y residues, e.g., one of the alanine or
glycine residues, is replaced with a core binding sequence. Inclusion of a known core binding
sequence in the anchor library allows identification of surrounding amino acid residues which are
additional important contacts between the peptide and the target molecule. The invention thus
allows identification of better binding sequences by identifying additional amino acids
surrounding the core binding sequence which in combination with the known core binding
sequence exhibit enhanced binding as compared to the known core binding sequence alone.

In certain embodiments, more than one known binding sequence is present in each of the
displayed peptide sequences of the anchor library. These multiple known binding sequences can
be adjacent to, or not adjacent to, each other, and can be identical to, or not identical to, each
other.

In certain embodiments, the anchor libraries have at least one constraint imposed upon the
displayed peptide sequence. A constraint includes, e.g., a crosslink, a stacking interaction, a
positive or negative charge, hydrophobicity, hydrophilicity, a structural motif and combinations
thereof. In certain embodiments, more than one constraint is present in each of the displayed
peptide sequences of the anchor library. These multiple constraints can be adjacent to, or not
adjacent to, each other, and can be identical to, or not identical to, each other.

A crosslink includes, e.g., a disulfide bond. In certain embodiments, the displayed peptide
has at least one cysteine residue. The cysteine residue can be, e.g., additional to the amino acid
residues of the displayed peptide sequences described above. In such libraries, the cysteine
residue can be positioned on the NH₂-terminal or COOH-terminal side of any of the X¹, X², X³ or
X⁴ amino acid residues, or on the NH₂-terminal or COOH-terminal side of any of the Y, e.g.,
alanine or glycine, residues. In other embodiments, at least one of the X residues is a cysteine
residue. In yet other embodiments, at least one of the Y residues, e.g., one of the alanine or
glycine residues, is replaced with a cysteine residue. Multiple cysteines can be present in each of
the peptides so as to form potential disulfide bonds within a random series. Disulfide bonds can
be formed within the displayed peptide sequence itself or between the displayed peptide
sequence and the target molecule.

A structural motif includes, e.g., a zinc finger formation, a leucine zipper, and a β-turn
structure in the peptide. The sequences asp gly or pro gly are likely to induce β-turns, either
alone or in combination with, e.g., a disulfide bond.

In other embodiments, the anchor libraries can be constructed to have both a core binding
sequence and a constraint, e.g., at least one cysteine residue. In one such embodiment, at least
one of the X residues can be, e.g., either a cysteine or a glycine such that the displayed peptide
sequence is:

$$(C/G)(Y^1)_{c^1}(C/G)(Y^2)_{c^2}B(C/G)(Y^3)_{c^3}(C/G)$$

where (C/G) is a cysteine or glycine residue. In such a library, multiple cysteines are present so
as to form potential disulfide bonds within a random series.
In yet other embodiments, the displayed peptide sequence comprises:

$$X^1(Y^1)_{c^1}X^2(Y^2)_{c^2}X^3(Y^3)_{c^3}X^4$$

wherein each Y¹, Y² and Y³ is alanine or glycine or a core binding sequence B of p amino acid
residues in length or a combination of alanine and glycine or alanine and B or glycine and B.
And in yet other embodiments, the displayed peptide sequence comprises:

wherein each Z¹, Z², Z³ and Z⁴ is an amino acid residue or a core binding sequence B of p amino
acid residues in length and any of Z¹, Z², Z³ and Z⁴ can be the same or different from any one
other, and wherein Z¹ and Z⁴ are each attached to an amino acid residue that flanks the displayed
peptide sequence.

Other embodiments include anchor libraries constructed with other configurations of
combinations between X residues and/or Y residues and/or B sequences and/or cysteine residues
and/or other constraints, as is obvious to those skilled in the art.

The invention also includes a method of making the anchor libraries described above. A
collection of nucleic acid sequences is synthesized and inserted into vectors to give recombinant
vectors. These recombinant vectors are introduced into a host. The host having the recombinant
vectors is propagated so as to result in a collection of recombinant vectors, each of the
recombinant vectors having a nucleic acid sequence from the collection of nucleic acid sequences
which encodes a displayed peptide sequence. The peptide sequence is any of the peptide
sequences discussed above, e.g., $X^1(Y^1)_{c^1}X^2(Y^2)_{c^2}X^3(Y^3)_{c^3}X^4$, with or without at least one core
binding sequence, and with or without at least one constraint, e.g., a cysteine residue. In certain
embodiments, at least about 10* to about 10» permutations, or about 10* permutations, or about
10⁵ permutations, or about 10* permutations, or about 10^ permutations, or about 10«
permutations, or about 10' permutations, of all possible permutations of the displayed peptide
sequence are present in the anchor library. In other embodiments, the library does not contain
more than about 10%, or more than about 1%, or more than about 0.1%, of displayed peptide
sequences different from the first mentioned displayed peptide sequences.

The nucleic acids that encode the anchor library can be obtained by any method which
produces the requisite permuted nucleic acids. For example, a split synthesis procedure can be
used. See, e.g., Cormack and Struhl, Science 262:244-248 (1993). Examples 1 and 3 describe
examples of using split synthesis to make nucleic acid inserts for anchor libraries.
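One way to picture the permuted nucleic acids is as degenerate coding templates, one per combination of c values. The Python sketch below is an illustration only: the codon choices (`NNK` for a random X position, `GST` for a glycine/alanine spacer position) are assumptions based on the standard genetic code and IUPAC base symbols, and are not taken from Examples 1 and 3 or from the cited split synthesis procedure.

```python
from itertools import product

# Standard genetic code, bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# IUPAC degenerate base symbols used in this sketch.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "N": "ACGT", "K": "GT", "S": "CG"}

def encoded_residues(degenerate_codon):
    """Set of residues ('*' = stop) encoded by a degenerate codon."""
    choices = [IUPAC[base] for base in degenerate_codon]
    return {CODON_TABLE["".join(c)] for c in product(*choices)}

def oligo_template(c1, c2, c3, random_codon="NNK", spacer_codon="GST"):
    """Degenerate coding template for X(Y)c1 X(Y)c2 X(Y)c3 X (hypothetical codons)."""
    parts = []
    for c in (c1, c2, c3):
        parts.append(random_codon)          # one random X position
        parts.extend([spacer_codon] * c)    # c spacer positions
    parts.append(random_codon)              # final X position
    return "".join(parts)

print(sorted(encoded_residues("GST")))      # spacer codon: alanine or glycine
print(len(encoded_residues("NNK")))         # 20 amino acids plus one stop
print(oligo_template(0, 1, 1))
```

In this sketch the variable spacer lengths are handled by writing out one template per (c¹, c², c³) combination; a split synthesis would instead divide the growing pool so that only some portions receive each spacer codon.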
The invention further includes a method of using the anchor libraries described above to
identify a peptide sequence that binds to a target. An anchor library having a collection of
recombinant vectors, each of which has a nucleic acid sequence which encodes a displayed
peptide sequence, is provided. The displayed peptide sequence can be any of the peptide
sequences discussed above, e.g., $X^1(Y^1)_{c^1}X^2(Y^2)_{c^2}X^3(Y^3)_{c^3}X^4$, with or without at least one core
binding sequence, and with or without at least one constraint, e.g., a cysteine residue. Expression
and display of the peptide sequence is permitted. The anchor library is contacted with the target
under conditions in which the displayed peptide sequence binds to the target, and the displayed
peptide sequence which binds to the target is identified.

Target is meant to include any molecule with which the displayed peptide sequence will
bind. Targets include, e.g., proteinaceous and non-proteinaceous molecules. Examples of targets
are ligands, receptors, hormones, cytokines, antibodies, antigens, enzymes, enzyme substrates
and viruses. In some cases, the binding peptide modulates the original activity of the target
molecule, and therefore can be useful as a drug. The target includes, e.g., drug antagonists and
agonists. The binding peptides can be used, e.g., for diagnostic or therapeutic purposes.

The contacting step can be done by any method in which the displayed peptide sequence
will bind, directly or indirectly, to the target. These methods include, e.g., screens and
selections. Preferably, an affinity purification method is used. Affinity purification includes,
e.g., biopanning. For example, a phage anchor library having displayed peptide sequences is
mixed with biotinylated target, resulting in a phage:biotinylated target complex if a displayed
peptide sequence binds to the target. The mixture is added to a streptavidin coated substance,
e.g., beads or a petri plate. The resulting biotin-streptavidin bond allows isolation of the phage
carrying peptide sequences that bind to the target. It is preferable to do multiple rounds of
biopanning to reduce background. See Example 2.
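Why multiple rounds reduce background can be illustrated with a deliberately simple model. The Python sketch below assumes that, in every round, binding phage are recovered with one fixed probability and non-binding phage with a much smaller one, and that amplification between rounds does not change their ratio. All recovery values are hypothetical and are not taken from Example 2.

```python
def binder_fraction(initial_fraction, binder_recovery, background_recovery, rounds):
    """Fraction of the phage pool that are binders after each round.

    Simplified model: fixed per-round recovery probabilities and
    ratio-preserving amplification between rounds.
    """
    fractions = []
    f = initial_fraction
    for _ in range(rounds):
        kept_binders = f * binder_recovery
        kept_background = (1.0 - f) * background_recovery
        f = kept_binders / (kept_binders + kept_background)
        fractions.append(f)
    return fractions

# Hypothetical numbers: one binder per million phage, 10% binder recovery,
# 0.01% non-specific background recovery.
for round_no, frac in enumerate(binder_fraction(1e-6, 0.1, 1e-4, 3), start=1):
    print(f"round {round_no}: binder fraction = {frac:.4f}")
```

Under these assumed values the ratio of binders to background improves by the same factor in every round, so a clone that starts as a tiny minority dominates the pool after a few rounds.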

Identification of the displayed peptide sequence includes, e.g., determining the sequence of
amino acids that comprise the peptide. Identification can be accomplished, e.g., by amplifying
the recombinant vector which has the nucleic acid sequence which encodes the displayed
peptide sequence which binds to the target, and sequencing the nucleic acid sequence by standard
procedures known in the art to determine the displayed peptide sequence which binds to the
target. If desired, the peptide thus identified can be synthesized using standard procedures
known in the art and further tested for its ability to bind to the target in vitro and/or in cell-based
and/or animal models. See Example 2.

In a given anchor library, the ability to determine essential amino acid contacts between
the displayed peptide and a target molecule is aided by the ability to observe conserved amino
acid residues in the different displayed peptides which are able to bind to the target. Conserved
amino acid residues are meant to include different DNA codons for the same amino acid or
different DNA codons for functionally similar amino acids. The consensus is determined by
comparing the sequence of the individual clones obtained from a library screen. It is preferable
that the library have sufficient complexity in order to observe such a consensus.
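
The consensus determination described above can be sketched in a few lines of Python. This is
only an illustration under stated assumptions: the clone peptides are taken to be already
aligned and of equal length, the groups of "functionally similar" residues and the 75%
threshold are hypothetical choices that do not come from this specification, and the function
name is invented for the example.

```python
from collections import Counter

# Hypothetical grouping of functionally similar residues (an assumption for
# illustration; the specification does not define these groups).
SIMILAR_GROUPS = [set("ILVM"), set("FWY"), set("KRH"), set("DE"),
                  set("ST"), set("NQ"), set("AG")]


def group_of(residue):
    """Return a label for the similarity group of a residue, or the residue itself."""
    for group in SIMILAR_GROUPS:
        if residue in group:
            return "".join(sorted(group))
    return residue


def consensus(peptides, threshold=0.75):
    """Per-position consensus of equal-length peptides; 'X' marks no consensus."""
    if len({len(p) for p in peptides}) != 1:
        raise ValueError("peptides must be aligned to the same length")
    result = []
    for column in zip(*peptides):
        # First look for a single conserved residue at this position.
        residue, count = Counter(column).most_common(1)[0]
        if count / len(column) >= threshold:
            result.append(residue)
            continue
        # Otherwise look for a conserved group of similar residues.
        label, group_count = Counter(group_of(r) for r in column).most_common(1)[0]
        if len(label) > 1 and group_count / len(column) >= threshold:
            result.append("[" + label + "]")
        else:
            result.append("X")
    return "".join(result)


# Hypothetical clone sequences, invented for the example.
clones = ["ATPWSQ", "GTPWLQ", "STPWKQ", "LTPWDQ"]
print(consensus(clones))
```

Positions where no residue or residue group reaches the threshold are reported as X, which
mirrors the Xaa notation used for unspecified positions elsewhere in this document.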

Also included in the invention is a peptide identified by use of any of the anchor libraries
described above in which the peptide is useful as a diagnostic or therapeutic product in that the
peptide is able to bind to a target molecule which is involved in a physiological process. For
example, the target molecule can be a receptor involved in inflammation, e.g., IL-1, or in prostate
cancer, e.g., GnRH; or the target molecule can be an enzyme, e.g., a protease, e.g., HIV protease.
By binding to these or other target molecules that are involved in various abnormal conditions or
diseases, the binding peptides of this invention modulate the original activity of the target
molecule and are therefore useful as diagnostic or therapeutic products.

The invention also includes a library which has a collection of nucleic acid molecules
encoding peptides having random amino acids, the improvement comprising a library in which
the random amino acids are not continuous so that the amino acids in the peptide that are
important contacts for interaction between the peptide and a target molecule can be identified.

The invention also includes a library having a collection of nucleic acid molecules
encoding peptides having random amino acids, the improvement comprising nucleic acid
molecules encoding alanine or glycine or a combination of alanine and glycine residues in
varying numbers acting as spacers between the random amino acids so that amino acid residues
in a peptide that are important contacts for interaction between the peptide and a target molecule
can be identified.

The invention further provides a collection of recombinant DNA molecules encoding
peptide sequences having a plurality of different binding domains. The peptide sequences
comprise: X^1(Y^1)c^1 X^2(Y^2)c^2 X^3(Y^3)c^3 X^4, wherein each X^1, X^2, X^3 and X^4 is an amino acid residue
and any of X^1, X^2, X^3 and X^4 can be the same or different from any one other, wherein each Y^1,
Y^2 and Y^3 is alanine or glycine or a combination of alanine and glycine that is respectively c^1, c^2
and c^3 amino acid residues long and any of Y^1, Y^2 and Y^3 if present can be the same or different
from any one other, wherein each of c^1, c^2 and c^3 is 0 to about 20, wherein X^1 and X^4 are each
attached to an amino acid residue that flanks the peptide sequence, and wherein at least about 10^
to about 10^ permutations, or about 10^ permutations, or about 10^ permutations, or about 10^
permutations, or about 10^ permutations, or about 10^ permutations, or about 10^ permutations,
of all possible permutations of the peptide sequence are present in the collection. In other
embodiments, the collection does not contain more than about 10%, or more than about 1%, or
more than about 0.1%, of displayed peptide sequences different from the first mentioned
displayed peptide sequences. In certain embodiments, the peptide sequences are displayed on the
surface of a biological material, e.g., a virus, phage, cell, spore or gene product.
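
To make the general formula concrete, the following Python sketch tests whether a peptide fits
the pattern X(Y)c X(Y)c X(Y)c X and counts how many (spacer arrangement, residue choice)
combinations a given maximum spacer length allows. It is an illustration only: the function
names are invented, X is assumed to be any of the 20 standard amino acids, and the count is an
upper bound on distinct peptides, because a random position that happens to be alanine or
glycine can make two different spacer arrangements yield the same sequence.

```python
import re
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # one-letter codes of the 20 standard residues


def fits_anchor_pattern(peptide, c_max=20):
    """True if peptide reads X, then three times (0..c_max Ala/Gly spacers, X)."""
    x = "[" + AMINO_ACIDS + "]"
    pattern = x + "(?:[AG]{0," + str(c_max) + "}" + x + "){3}"
    return re.fullmatch(pattern, peptide) is not None


def design_count(c_max, n_random=4):
    """Count spacer/residue combinations for n_random random positions.

    Each random position has 20 choices; each spacer residue is Ala or Gly.
    """
    n_spacers = n_random - 1
    return sum(20 ** n_random * 2 ** sum(lengths)
               for lengths in product(range(c_max + 1), repeat=n_spacers))


print(fits_anchor_pattern("KAAGRWGY"))  # K, AAG spacer, R, no spacer, W, G spacer, Y
print(design_count(4))                  # spacers of 0 to 4 residues, as in Example 1
```

With spacers limited to 0 to 4 residues, the count is 20^4 x 31^3, since each spacer contributes
1 + 2 + 4 + 8 + 16 = 31 possibilities.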

The invention also includes a recombinant filamentous phage having a displayed peptide
sequence with known binding properties. The displayed peptide sequence is foreign to the
filamentous phage. The displayed peptide sequence comprises: X^1(Y^1)c^1 X^2(Y^2)c^2 X^3(Y^3)c^3 X^4,
wherein each X^1, X^2, X^3 and X^4 is an amino acid residue and any of X^1, X^2, X^3 and X^4 can be the
same or different from any one other, wherein each Y^1, Y^2 and Y^3 is alanine or glycine or a
combination of alanine and glycine that is respectively c^1, c^2 and c^3 amino acid residues long and
any of Y^1, Y^2 and Y^3 if present can be the same or different from any one other, wherein each of
c^1, c^2 and c^3 is 0 to about 20, wherein X^1 and X^4 are each attached to an amino acid residue that
flanks the displayed peptide sequence, and wherein the displayed peptide sequence is able to
bind to a target. In certain embodiments, at least one of Y^1, Y^2 and Y^3 is at least about 20 amino
acid residues long, preferably is at least about 10 amino acid residues long, more preferably is at
least about 6 amino acid residues long, even more preferably is at least about 4 amino acid
residues long, more preferably yet is at least about 3 amino acid residues long, more preferably
yet is at least about 2 amino acid residues long, and most preferably is at least about 1 amino acid
residue long.

The invention also includes a recombinant vector having a nucleic acid sequence inserted
in a gene. The nucleic acid sequence encodes a displayed peptide sequence having known
binding properties. The displayed peptide sequence comprises: X^1(Y^1)c^1 X^2(Y^2)c^2 X^3(Y^3)c^3 X^4,
wherein each X^1, X^2, X^3 and X^4 is an amino acid residue and any of X^1, X^2, X^3 and X^4 can be the
same or different from any one other, wherein each Y^1, Y^2 and Y^3 is alanine or glycine or a
combination of alanine and glycine that is respectively c^1, c^2 and c^3 amino acid residues long and
any of Y^1, Y^2 and Y^3 if present can be the same or different from any one other, wherein each of
c^1, c^2 and c^3 is 0 to about 20, wherein X^1 and X^4 are each attached to an amino acid residue that
flanks the displayed peptide sequence, and wherein the displayed peptide sequence is able to
bind to a target. In certain embodiments, at least one of Y^1, Y^2 and Y^3 is at least about 20 amino
acid residues long, preferably is at least about 10 amino acid residues long, more preferably is at
least about 6 amino acid residues long, even more preferably is at least about 4 amino acid
residues long, more preferably yet is at least about 3 amino acid residues long, more preferably
yet is at least about 2 amino acid residues long, and most preferably is at least about 1 amino acid
residue long.

The invention also includes a recombinant nucleic acid molecule having a nucleic acid
sequence inserted in a gene. The nucleic acid sequence encodes a displayed peptide sequence
having known binding properties. The displayed peptide sequence comprises: X^1(Y^1)c^1 X^2(Y^2)c^2
X^3(Y^3)c^3 X^4, wherein each X^1, X^2, X^3 and X^4 is an amino acid residue and any of X^1, X^2, X^3 and
X^4 can be the same or different from any one other, wherein each Y^1, Y^2 and Y^3 is alanine or
glycine or a combination of alanine and glycine that is respectively c^1, c^2 and c^3 amino acid
residues long and any of Y^1, Y^2 and Y^3 if present can be the same or different from any one other,
wherein each of c^1, c^2 and c^3 is 0 to about 20, wherein X^1 and X^4 are each attached to an amino
acid residue that flanks the displayed peptide sequence, and wherein the displayed peptide
sequence is able to bind to a target. In certain embodiments, at least one of Y^1, Y^2 and Y^3 is at
least about 20 amino acid residues long, preferably is at least about 10 amino acid residues long,
more preferably is at least about 6 amino acid residues long, more preferably is at least about 4
amino acid residues long, more preferably yet is at least about 3 amino acid residues long, more
preferably yet is at least about 2 amino acid residues long, and most preferably is at least about 1
amino acid residue long.

The invention further includes a recombinant protein having a displayed peptide sequence
having known binding properties. The displayed peptide sequence comprises: X^1(Y^1)c^1 X^2(Y^2)c^2
X^3(Y^3)c^3 X^4, wherein each X^1, X^2, X^3 and X^4 is an amino acid residue and any of X^1, X^2, X^3 and
X^4 can be the same or different from any one other, wherein each Y^1, Y^2 and Y^3 is alanine or
glycine or a combination of alanine and glycine that is respectively c^1, c^2 and c^3 amino acid
residues long and any of Y^1, Y^2 and Y^3 if present can be the same or different from any one other,
wherein each of c^1, c^2 and c^3 is 0 to about 20, wherein X^1 and X^4 are each attached to an amino
acid residue that flanks the displayed peptide sequence, and wherein the displayed peptide
sequence is able to bind to a target. In certain embodiments, at least one of Y^1, Y^2 and Y^3 is at
least about 20 amino acid residues long, preferably is at least about 10 amino acid residues long,
more preferably is at least about 6 amino acid residues long, even more preferably is at least
about 4 amino acid residues long, more preferably yet is at least about 3 amino acid residues
long, more preferably yet is at least about 2 amino acid residues long, and most preferably is at
least about 1 amino acid residue long.

EXAMPLES

Example 1: Construction of a Phage Anchor Library

This example illustrates the construction of a phage anchor library having random amino
acid codons distributed throughout a domain of alanine and/or glycine codons. Standard cloning
techniques known to those skilled in the art were used.

(a) Vector Preparation

30 µg of Fuse5 (Smith and Scott, Methods in Enzymology 217:228-257 (1993)) was
cleaved with 200 units of endonuclease Sfi I in 500 µl of NEB #2 restriction buffer for 10 hours.
The reaction was terminated with addition of 15 mM EDTA, followed by phenol and chloroform
extractions. The DNA was recovered by isopropanol precipitation, resuspended in 500 µl of TE,
and recovered by EtOH precipitation.

(b) Insert Preparation

The anchor insert used in the library was synthesized as a single stranded oligomer using
split synthesis. See, e.g., Cormack and Struhl, Science 262:244-248 (1993). This process creates
combinations of sequences which differ from each other.

Using split synthesis, five templates were synthesized and mixed three times to produce
the anchor library:

1) GGGCTGCCGGGNNKNNK (Seq. ID No. 1)
2) GGGCTGCCGGGNNKGSNNNK (Seq. ID No. 2)
3) GGGCTGCCGGGNNKGSNGSNNNK (Seq. ID No. 3)
4) GGGCTGCCGGGNNKGSNGSNGSNNNK (Seq. ID No. 4)
5) GGGCTGCCGGGNNKGSNGSNGSNGSNNNK (Seq. ID No. 5)

COMBINE AND SPLIT

6) NNK
7) GSNNNK
8) GSNGSNNNK
9) GSNGSNGSNNNK (Seq. ID No. 6)
10) GSNGSNGSNGSNNNK (Seq. ID No. 7)

COMBINE AND SPLIT

11) NNKGGTGGTGCTGCTG (Seq. ID No. 8)
12) GSNNNKGGTGGTGCTGCTG (Seq. ID No. 9)
13) GSNGSNNNKGGTGGTGCTGCTG (Seq. ID No. 10)
14) GSNGSNGSNNNKGGTGGTGCTGCTG (Seq. ID No. 11)
15) GSNGSNGSNGSNNNKGGTGGTGCTGCTG (Seq. ID No. 12)

COMBINE

N = equal mix of G, A, T, C
S = equal mix of G, C
K = equal mix of G, T

DNA was chemically synthesized such that column 1 contained the DNA sequence
GGGCTGCCGGG (Seq. ID No. 13), followed by DNA encoding a random amino acid, NNK,
followed by DNA encoding a second random amino acid, NNK. Column 2 encoded the DNA
sequence GGGCTGCCGGG (Seq. ID No. 13), followed by a random amino acid codon, NNK,
followed by either a glycine or alanine codon, GSN, and then followed by a random amino acid
codon, NNK. Columns 3, 4 and 5 encoded the DNA sequence GGGCTGCCGGG (Seq. ID No.
13), followed by a random amino acid codon, NNK, followed by, respectively, 2, 3 and 4 glycine
and/or alanine codons, GSN, and then followed by a random amino acid codon, NNK.
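
The degenerate codons used in the templates can be checked with a short Python sketch. It
expands NNK and GSN using the base mixes defined in the key above and translates each codon
with the standard genetic code. The helper names are invented for the example; the standard
code is an assumption about the expression host, although it is the usual one for E. coli.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code; codons run TTT, TTC, TTA, TTG, TCT, ... with each
# position cycling through T, C, A, G. '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Base mixes from the key above, plus the four unambiguous bases.
MIXES = {"N": "GATC", "S": "GC", "K": "GT",
         "G": "G", "A": "A", "T": "T", "C": "C"}


def expand(degenerate_codon):
    """List every concrete codon covered by a degenerate codon such as 'NNK'."""
    return ["".join(bases)
            for bases in product(*(MIXES[symbol] for symbol in degenerate_codon))]


def encoded_residues(degenerate_codon):
    """Sorted set of residues ('*' = stop) encoded by a degenerate codon."""
    return sorted({CODON_TABLE[codon] for codon in expand(degenerate_codon)})


print(len(expand("NNK")), encoded_residues("NNK"))
print(len(expand("GSN")), encoded_residues("GSN"))
```

Under these assumptions NNK covers 32 codons that encode all 20 amino acids plus a single stop
codon (TAG), and GSN covers 8 codons that encode only alanine (GCN) and glycine (GGN), which
is consistent with its use as the spacer codon.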

After synthesis of columns 1-5, the resins from the five columns were mixed, resulting in a
pool of oligomers which contained two random amino acids separated by 0 to 4 glycine and/or
alanine residues. This entire mixture was then split into 5 new columns, denoted 6-10. Each of
these columns was subjected to further DNA synthesis, resulting in, respectively, codons for 0, 1,
2, 3 and 4 glycine and/or alanine residues, GSN, followed by a random amino acid, NNK.
Because the additions of columns 6-10 were conducted on a mixture of resins from columns 1-5,
the mixture of columns 6-10 resulted in oligomers that all have three random amino acids, such
that the neighboring random amino acids are separated by 0 to 4 glycine and/or alanine residues.

One additional round of split synthesis was undertaken in which the mixtures of columns
6-10 were extended with 0 to 4 glycine and/or alanine residues, GSN, and one more additional
random amino acid, NNK, followed by the sequence GGTGGTGCTGCTG (Seq. ID No. 14).
The final mixture of these columns resulted in a series of oligomers with four random amino
acids such that the neighboring random amino acids are separated by 0 to 4 glycine and/or
alanine residues.
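
The outcome of the three rounds of split synthesis can be enumerated directly. The Python
sketch below builds one degenerate template per combination of spacer lengths, using the
flanking sequences given above (Seq. ID Nos. 13 and 14). The function name and the dictionary
layout are invented for the example; it models only which template sequences exist, not the
chemistry or the relative yields of the columns.

```python
from itertools import product

FIVE_PRIME_FLANK = "GGGCTGCCGGG"     # Seq. ID No. 13
THREE_PRIME_FLANK = "GGTGGTGCTGCTG"  # Seq. ID No. 14


def anchor_templates(max_spacer=4):
    """Map each spacer-length triple (c1, c2, c3) to its degenerate template."""
    templates = {}
    for c1, c2, c3 in product(range(max_spacer + 1), repeat=3):
        templates[(c1, c2, c3)] = (
            FIVE_PRIME_FLANK
            + "NNK" + "GSN" * c1   # round 1: columns 1-5
            + "NNK" + "GSN" * c2   # round 2: columns 6-10
            + "NNK" + "GSN" * c3   # round 3: columns 11-15
            + "NNK" + THREE_PRIME_FLANK
        )
    return templates


templates = anchor_templates()
lengths = [len(t) for t in templates.values()]
print(len(templates), min(lengths), max(lengths))
print(templates[(0, 0, 0)])
```

Five columns in each of three rounds give 5 x 5 x 5 = 125 templates, ranging from 36 to 72
nucleotides under this model.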

Two additional oligomers, pins, CCCGGCAGCCCCGT (Seq. ID No. 15) and
CAGCACCACC (Seq. ID No. 16), were synthesized which hybridize to the anchor oligomers so
as to reconstruct double stranded DNA near the termini of the insert with three single strand
nucleotide overhangs corresponding to Sfi I overhangs.
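
The pairing between the pin oligomers and the constant ends of the anchor oligomers can be
verified with a small Python sketch. It only checks base complementarity of the sequences
quoted above; the function names are invented for the example, and no claim is made here about
how the resulting overhangs match a particular vector.

```python
ANCHOR_5_FLANK = "GGGCTGCCGGG"    # Seq. ID No. 13, 5' end of every anchor oligomer
ANCHOR_3_FLANK = "GGTGGTGCTGCTG"  # Seq. ID No. 14, 3' end of every anchor oligomer
PIN_5 = "CCCGGCAGCCCCGT"          # Seq. ID No. 15
PIN_3 = "CAGCACCACC"              # Seq. ID No. 16

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def pin_overhangs():
    """Return the unpaired bases left at each end after the pins anneal."""
    # Pin 15 pairs with the whole 5' flank and extends past it at its own 3' end.
    if not reverse_complement(PIN_5).endswith(ANCHOR_5_FLANK):
        raise ValueError("pin 15 does not pair with the 5' flank")
    # Pin 16 pairs with the start of the 3' flank, leaving the anchor's last bases free.
    if not ANCHOR_3_FLANK.startswith(reverse_complement(PIN_3)):
        raise ValueError("pin 16 does not pair with the 3' flank")
    return {
        "pin_3prime_overhang": PIN_5[len(ANCHOR_5_FLANK):],
        "anchor_3prime_overhang": ANCHOR_3_FLANK[len(PIN_3):],
    }


print(pin_overhangs())
```

Both unpaired stretches are three nucleotides long, in agreement with the three-nucleotide
single strand overhangs described above.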

The insert and pin oligomers were kinased at 10 ng/30 µl kinase buffer from NEB with 1
mM ATP at 37°C for 30 minutes, followed by inactivation at 68°C for 5 minutes. The anchor
oligomer was annealed to the pin oligomers in 500 mM NaCl, 50 mM Tris pH 7.5 at 68°C for 10
minutes and cooled to room temperature over 30 minutes. Each of the oligomers was at 5 nM
during the annealing.

It is noted that similar results can be obtained with other 5' and 3' flanking sequences on
the anchor inserts, and with other corresponding pin sequences altered appropriately, as can be
chosen by those skilled in the art. Moreover, other restriction sites can be used as appropriate for
any given vector, as is known to those skilled in the art.

(c) Vector Ligation

30 ng of DNA vector was ligated to assembled insert at 5 µg/ml vector and three-fold
excess assembled insert in NEB ligation buffer with 100 units of T4 DNA ligase at 10°C for 16
hours. DNA was purified from ligation buffer by phenol and chloroform extractions, followed
by EtOH precipitation and resuspension in TE.

(d) DNA Transformation

DNA was transformed into MC1061 (Wertman et al., Gene 49:253-262 (1986))
electrocompetent cells using 0.5 µg of DNA per 100 µl of cells using 0.2 cm electroporator cells
and a BioRad electroporator set at 25 µF, 2.5 kV and 200 ohms. Shocked cells were recovered
in SOC media, grown out at 37°C for 20 minutes and inoculated into LB containing 20 µg/ml
tetracycline.

(e) Library Phage Isolation

Phage released from transformed cells were isolated after growing for 16 hours. Phage
were separated from cells by centrifugation at 4°C at 4.2K for 30 min in a Beckman J6,
followed by a second centrifugation of the supernatant at 4.2K for 30 min. Phage were
precipitated with the addition of 150 ml of 16.7% PEG/3.3 M NaCl per liter of supernatant.
Mixed solutions were incubated at 4°C for 16 hours. Precipitated phage was collected at 4.2K in
a J6 followed by resuspension in 40 ml of TBS. Resuspended phage were precipitated again
with the addition of 4.5 ml of PEG solution for 4 hours. Phage were collected at 5K in a
Beckman JA20 at 4°C. Phage were suspended in 7 ml of TBS and brought to 1.3 g/ml density
by the addition of 1 gm of CsCl per 2.226 gm of aqueous solution. Phage were subjected to
equilibrium centrifugation in a type 80 rotor at 45K rpm for 40 hours. Phage bands were
isolated, diluted 20 fold with TBS and pelleted at 40K in a Type 50 rotor. Pellets were
resuspended in 0.7 ml of TBS and used as is for biopanning at approximately 3 x 10" phage/ml.

Example 2: Biopanning to Select for Peptide Binding Sequences

This example illustrates biopanning of the phage library obtained from Example 1 to select
for displayed peptide sequences that bind to biotinylated IL-1β. The phage act as affinity-selectable
vectors in that the displayed peptide binds specifically to immobilized IL-1β if the
library contains a displayed peptide that can so interact with IL-1β.

(a) Binding

Biotinylated IL-1 (b-IL-1) (Yew et al., JBC 264(30):17691-17697 (1989)) is incubated
with 1x10" phage in 20 µl of TBS for 20 minutes at 22°C. The phage:(b-IL-1) complex is
isolated from free phage by addition of streptavidin coated paramagnetic beads for an additional
10 minutes. Magnetic beads are collected by attraction with a magnet and washed with buffer
containing 0.5% Tween-20 for a total of ? washes over 30 minutes. The remaining phage that
are bound to the beads (by way of b-IL-1 binding to streptavidin) are recovered by elution with
100 µl of 100 mM glycine pH 2.2 for 10 minutes. Eluted phage are neutralized with 1 M Tris
base.

(b) Amplification

Eluted phage are amplified by infection into log phase K91 E. coli (Lyons and Zinder,
Virology 49:45-60 (1972); Smith and Scott, Methods in Enzymology 217:228-257 (1993)) at an
moi of 0.0001. Approximately 10^ phage are amplified by plating on 10 LB agar petri dishes
containing 20 µg/ml tetracycline. The phage released from infected cells, approximately 10^^
phage, are harvested by washing the LB agar plates with TE, and purified by
two PEG precipitations and resuspended at 10*^ phages.

Amplified phage are further subjected to two additional rounds of biopanning using the
binding and amplification conditions described above.

(c) Sequencing Inserts

After three rounds of biopanning, individual phage are isolated and sequenced to reveal the
DNA sequence that encodes the displayed peptide in the selected phage. Sequencing is done
according to manufacturer's protocol for Sequenase 2.0 (United States Biochemical, Cleveland,
OH 44122).

(d) Peptide Synthesis

Peptides representing affinity purified phage are synthesized (Research Genetics,
Huntsville, AL 35801) and tested for their ability to bind IL-1 and affect IL-1 binding to IL-1
receptor in cell based and animal models. Slack et al., Biotechniques 10:1132-1138 (1989).

Example 3: Construction of a Phage Anchor Library Having Codons For a Known Core
Peptide Binding Sequence

This example illustrates construction of a phage anchor library which has codons for a
known core peptide binding sequence which binds to a target molecule, surrounded by random
amino acid codons distributed throughout a domain of random alanine and/or glycine codons.
Construction of this type of library is similar to that described in Example 1, except that the
oligomer constructs not only have the random amino acid codons and glycine and/or alanine
codons, but also have nucleic acid sequences which code for a known core peptide binding
sequence, denoted as B:

1) GGGCTGCCGGGNNKNNK (Seq. ID No. 1)
2) GGGCTGCCGGGNNKGSNNNK (Seq. ID No. 2)
3) GGGCTGCCGGGNNKGSNGSNNNK (Seq. ID No. 3)
4) GGGCTGCCGGGNNKGSNGSNGSNNNK (Seq. ID No. 4)
5) GGGCTGCCGGGNNKGSNGSNGSNGSNNNK (Seq. ID No. 5)

COMBINE AND SPLIT

6) BNNK
7) BGSNNNK
8) BGSNGSNNNK
9) BGSNGSNGSNNNK (Seq. ID No. 6)
10) BGSNGSNGSNGSNNNK (Seq. ID No. 7)

COMBINE AND SPLIT

11) NNKGGTGGTGCTGCTG (Seq. ID No. 8)
12) GSNNNKGGTGGTGCTGCTG (Seq. ID No. 9)
13) GSNGSNNNKGGTGGTGCTGCTG (Seq. ID No. 10)
14) GSNGSNGSNNNKGGTGGTGCTGCTG (Seq. ID No. 11)
15) GSNGSNGSNGSNNNKGGTGGTGCTGCTG (Seq. ID No. 12)

COMBINE

The anchor library can also be constructed such that sequence B is located, e.g., before or
after any of the other NNK or GSN codons.

Other anchor libraries, containing additions or substitutions of nucleic acid sequences, can
be constructed using similar methods. For example, codons for cysteine, or any other specified
amino acid or sequence of amino acids, can be substituted for the nucleic acid sequence coding
for the core binding sequence B in the above-described split synthesis. Anchor libraries
containing two or more core binding sequences, cysteines, or any other specified amino acid or
sequence of amino acids, also can be constructed using similar procedures as described, except
that the multiple additions are synthesized as part of the oligomers at multiple positions, e.g.,
each can be located before or after any of the NNK or GSN codons, as can be chosen by those
skilled in the art.

Example 4: Four Amino Acid Residues in a Peptide Is Sufficient For Binding to a Target

This example illustrates that four amino acid residues in a peptide are sufficient for
micromolar binding between the peptide and its target.

A hexamer phage library was constructed essentially as described for the anchor libraries,
except the oligonucleotide was:
GGGCTGCCGGGNNKNNKNNKNNKNNKNNKGGTGGTGCTGCTG (Seq. ID No. 18). The
library was screened against an antibody to hCG by biopanning as described in Example 2. The
phage that bound to the antibody contained the consensus sequence XaaThrProTrpXaaGln (Seq.
ID No. 17), where Xaa was not absolutely specified. Peptides were synthesized which
corresponded to the identified sequences and the flanking amino acids found in the phage. These
peptides had an IC50 of 4.5 µM compared to 10 nM for hCG. IC50 is equal to the concentration
of peptide necessary to prevent 50% of hCG-I^^ from binding to the antibody. Therefore, four
amino acid residues were sufficient to result in µM binding.

Those skilled in the art will be able to ascertain, using no more than routine
experimentation, many equivalents of the specific embodiments of the invention described
herein. These and all other equivalents are intended to be encompassed by the following claims.

SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANT:
(A) NAME: PHARMACEUTICAL PEPTIDES, INC.
(B) STREET: ONE HAMPSHIRE STREET
(C) CITY: CAMBRIDGE
(D) STATE: MASSACHUSETTS
(E) COUNTRY: UNITED STATES OF AMERICA
(F) ZIP: 02139-1572

(ii) TITLE OF INVENTION: ANCHOR LIBRARIES AND IDENTIFICATION OF
PEPTIDE BINDING SEQUENCES

(iii) NUMBER OF SEQUENCES: 18

(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: Wolf, Greenfield & Sacks, P.C.
(B) STREET: 600 Atlantic Avenue
(C) CITY: Boston
(D) STATE: Massachusetts
(E) COUNTRY: USA
(F) ZIP: 02210

(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.25

(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER:
(B) FILING DATE:
(C) CLASSIFICATION:

(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: US 08/479,660
(B) FILING DATE: 07-JUN-1995

(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: Greer, Helen
(B) REGISTRATION NUMBER: 36,816
(C) REFERENCE/DOCKET NUMBER: P0567/7000WO

(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: (617) 720-3500
(B) TELEFAX: (617) 720-2441

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

GGGCTGCCGG GNNKNNK 17

(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

GGGCTGCCGG GNNKGSNNNK 20

(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

GGGCTGCCGG GNNKGSNGSN NNK 23

(2) INFORMATION FOR SEQ ID NO:4:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 26 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

GGGCTGCCGG GNNKGSNGSN GSNNNK 26

(2) INFORMATION FOR SEQ ID NO:5:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 29 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:

GGGCTGCCGG GNNKGSNGSN GSNGSNNNK 29

(2) INFORMATION FOR SEQ ID NO:6:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 12 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:

GSNGSNGSNN NK 12

(2) INFORMATION FOR SEQ ID NO:7:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 15 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:

GSNGSNGSNG SNNNK 15

(2) INFORMATION FOR SEQ ID NO:8:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 16 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:

NNKGGTGGTG CTGCTG 16

(2) INFORMATION FOR SEQ ID NO:9:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 19 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:

GSNNNKGGTG GTGCTGCTG 19

(2) INFORMATION FOR SEQ ID NO:10:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:

GSNGSNNNKG GTGGTGCTGC TG 22

(2) INFORMATION FOR SEQ ID NO:11:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 25 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:

GSNGSNGSNN NKGGTGGTGC TGCTG 25

(2) INFORMATION FOR SEQ ID NO:12:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 28 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:

GSNGSNGSNG SNNNKGGTGG TGCTGCTG 28

WO 96/41180 PCT/US96/09383

(2) INFORMATION FOR SEQ ID NO:13:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 11 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:13:

GGGCTGCCGG G 11

(2) INFORMATION FOR SEQ ID NO:14:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 13 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:14:

GGTGGTGCTG CTG 13

(2) INFORMATION FOR SEQ ID NO:15:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 14 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:15:

CCCGGCAGCC CCGT 14



(2) INFORMATION FOR SEQ ID NO:16:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 10 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:16:

CAGCACCACC 10

(2) INFORMATION FOR SEQ ID NO:17:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 6 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:17:

Xaa Thr Pro Trp Xaa Gln
1               5



(2) INFORMATION FOR SEQ ID NO:18:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 42 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:18:

GGGCTGCCGG GNNKNNKNNK NNKNNKNNKG GTGGTGCTGC TG 42

CLAIMS

1. An anchor library, comprising:

a collection of recombinant vectors,

each of said recombinant vectors having a nucleic acid sequence inserted in a
gene, said nucleic acid sequence encoding a displayed peptide sequence,

said displayed peptide sequence of each of said vectors comprising
X^1(Y^1)c^1 X^2(Y^2)c^2 X^3(Y^3)c^3 X^4,
wherein each X^1, X^2, X^3 and X^4 is an amino acid residue and any of X^1, X^2, X^3 and X^4 can be the
same or different from any one other, wherein each Y^1, Y^2 and Y^3 is alanine or glycine or a
combination of alanine and glycine that is respectively c^1, c^2 and c^3 amino acid residues long and
any of Y^1, Y^2 and Y^3 if present can be the same or different from any one other, wherein each of
c^1, c^2 and c^3 is 0 to about 20, wherein X^1 and X^4 are each attached to an amino acid residue that
flanks said displayed peptide sequence, and

wherein at least about 10^ to about 10' permutations of all possible permutations of said
displayed peptide sequence are present in said anchor library.

2. The library of claim 1 wherein said library does not contain more than about 10% of
displayed peptide sequences different from said first mentioned displayed peptide sequences.

3. The library of claim 1 wherein said library does not contain more than about 1% of
displayed peptide sequences different from said first mentioned displayed peptide sequences.

4. The library of claim 1 wherein said library does not contain more than about 0.1% of
displayed peptide sequences different from said first mentioned displayed peptide sequences.

5. The library of claim 1 wherein at least about 10^ permutations of all possible
permutations of said displayed peptide sequence are present in said anchor library.

6. The library of claim 1 wherein at least about 10^ permutations of all possible
permutations of said displayed peptide sequence are present in said anchor library.

7. The library of claim 1 wherein at least about 10' permutations of all possible
permutations of said displayed peptide sequence are present in said anchor library.

8. The library of claim 1 wherein at least about 10^ permutations of all possible
permutations of said displayed peptide sequence are present in said anchor library.

9. The library of claim 1 wherein said vector is selected from the group consisting of a
virus, phage, plasmid and cosmid.

10. The library of claim 1 wherein said vector is a filamentous phage.

11. The library of claim 10 wherein said gene that said nucleic acid sequence is inserted in is
a coat protein gene of said filamentous phage.

12. The library of claim 10 wherein said gene that said nucleic acid sequence is inserted in is
a filamentous phage gene selected from the group consisting of gene III and gene VIII.

13. The library of claim 10 wherein said gene that said nucleic acid sequence is inserted in is
selected from the group consisting of thioredoxin, staph nuclease, lac repressor, gal4 and an
antibody.

14. The library of claim 1 wherein said displayed peptide sequence is displayed on the
surface of a virion.

15. The library of claim 1 wherein said displayed peptide sequence is displayed on the
surface of a cell.

16. The library of claim 1 wherein said displayed peptide sequence is displayed on the
surface of an expressed gene product.

17. The library of claim 1 wherein each of said c^1, c^2 and c^3 is 0 to about 10.

18. The library of claim 1 wherein each of said c^1, c^2 and c^3 is 0 to about 6.

19. The library of claim 1 wherein each of said c^1, c^2 and c^3 is 0 to about 4.

20. The library of claim 1 further comprising about 1 to about 10 additional units of X(Y)c.

21. The library of claim 1 wherein said displayed peptide sequence is not identical to a
naturally occurring peptide sequence.

22. The library of claim 1 wherein said displayed peptide sequence is identical to a naturally
occurring peptide sequence.

23. The library of claim 1 wherein said displayed peptide sequence further comprises at least
one B, said B being a core binding sequence of p amino acid residues in length.

24. The library of claim 23 wherein p is about 1 to about 20.

25. The library of claim 23 wherein p is about 4 to about 10.

26. The library of claim 23 wherein p is about 6.

27. The library of claim 23 wherein said B is selected from the group consisting of said B
being on the NH2-terminal side of any of said X^1, X^2, X^3 or X^4 amino acid residues, said B being
on the COOH-terminal side of any of said X^1, X^2, X^3 or X^4 amino acid residues, said B being on
the NH2-terminal side of any of said alanine or glycine residues, and said B being on the
COOH-terminal side of any of said alanine or glycine residues.

28. The library of claim 23 wherein more than one said B is present 

29. The library of claim 28 wherein said Bs are adjacent to each other. 

30. The library of claim 28 wherein said Bs are not adjacent to each other.

31. The library of claim 28 wherein said Bs are identical to each other.

32. The library of claim 28 wherein said Bs are not identical to each other.



33. The library of claim 1 wherein said displayed peptide sequence further comprises at least one constraint selected from the group consisting of a crosslink, a stacking interaction, a positive or negative charge, hydrophobicity, hydrophilicity, a structural motif and combinations thereof.

34. The library of claim 33 wherein said crosslink is a disulfide bond.

35. The library of claim 1 wherein said displayed peptide sequence further comprises at least one cysteine residue.

36. The library of claim 35 wherein said cysteine residue is selected from the group consisting of said cysteine residue being on the NH2-terminal side of any of said X^1, X^2, X^3 or X^4 amino acid residues, said cysteine residue being on the COOH-terminal side of any of said X^1, X^2, X^3 or X^4 amino acid residues, said cysteine residue being on the NH2-terminal side of any of said alanine or glycine residues, and said cysteine residue being on the COOH-terminal side of any of said alanine or glycine residues.

37. The library of claim 1 wherein at least one of said X^1, X^2, X^3 or X^4 residues is a cysteine residue.

38. The library of claim 1 further comprising at least one B, said B being a core binding 
sequence of p amino acid residues in length, and at least one cysteine residue. 

39. The library of claim 38 wherein said displayed peptide sequence comprises:

(C/G)(Y^1)_{c^1}(C/G)(Y^2)_{c^2}B(C/G)(Y^3)_{c^3}(C/G)

wherein (C/G) is a cysteine or glycine residue.

40. The library of claim 1 wherein the complexity of said library is at least about 10^.

41. An anchor library, comprising:

a collection of recombinant vectors,

each of said recombinant vectors having a nucleic acid sequence inserted in a gene, said nucleic acid sequence encoding a displayed peptide sequence,

said displayed peptide sequence of each of said vectors comprising

X^1(Y^1)_{c^1}X^2(Y^2)_{c^2}X^3(Y^3)_{c^3}X^4

wherein each X^1, X^2, X^3 and X^4 is an amino acid residue and any of X^1, X^2, X^3 and X^4 can be the same or different from any one other, wherein each Y^1, Y^2 and Y^3 is alanine or glycine or a combination of alanine and glycine that is respectively c^1, c^2 and c^3 amino acid residues long and any of Y^1, Y^2 and Y^3 if present can be the same or different from any one other, wherein each of c^1, c^2 and c^3 is 0 to about 20, wherein X^1 and X^4 are each attached to an amino acid residue that flanks said displayed peptide sequence, and

wherein said library does not contain more than about 10% of displayed peptide sequences different from said first mentioned displayed peptide sequences.

42. A method of making said anchor library of claim 1, comprising:

synthesizing a collection of nucleic acid sequences;

inserting said nucleic acid sequences into vectors to give recombinant vectors;

introducing said recombinant vectors into a host;

propagating said host having said recombinant vectors so as to result in a collection of recombinant vectors, each of said recombinant vectors having a nucleic acid sequence from said collection of nucleic acid sequences which encodes a displayed peptide sequence;

said displayed peptide sequence comprising:

X^1(Y^1)_{c^1}X^2(Y^2)_{c^2}X^3(Y^3)_{c^3}X^4

wherein each X^1, X^2, X^3 and X^4 is an amino acid residue and any of X^1, X^2, X^3 and X^4 can be the same or different from any one other, wherein each Y^1, Y^2 and Y^3 is alanine or glycine or a combination of alanine and glycine that is respectively c^1, c^2 and c^3 amino acid residues long and any of Y^1, Y^2 and Y^3 if present can be the same or different from any one other, wherein each of c^1, c^2 and c^3 is 0 to about 20, wherein X^1 and X^4 are each attached to an amino acid residue that flanks said displayed peptide sequence, and

wherein at least about 10^ to about 10* permutations of all possible permutations of said displayed peptide sequence are present in said anchor library.

43. The method of claim 42 wherein said library does not contain more than about 10% of displayed peptide sequences different from said first mentioned displayed peptide sequences.

44. A method of using said anchor library of claim 1 to identify a peptide sequence that binds to a target, comprising:

providing said anchor library of claim 1, said anchor library having a collection of recombinant vectors, each of said recombinant vectors having a nucleic acid sequence which encodes a displayed peptide sequence comprising

X^1(Y^1)_{c^1}X^2(Y^2)_{c^2}X^3(Y^3)_{c^3}X^4;

permitting the expression and display of said peptide sequence;

contacting said anchor library with said target under conditions in which said displayed peptide sequence binds to said target; and

identifying said displayed peptide sequence which binds to said target.

45. The method of claim 44 wherein said contacting step is by affinity purification.

46. The method of claim 44 wherein said identifying step comprises amplifying said recombinant vector which has said nucleic acid sequence which encodes for said displayed peptide sequence which binds to said target, and sequencing said nucleic acid sequence to determine said displayed peptide sequence which binds to said target.



47. The method of claim 44 further comprising the step of synthesizing said identified displayed peptide sequence.

48. The method of claim 44 wherein said target is selected from the group consisting of proteinaceous molecules and non-proteinaceous molecules.



49. The method of claim 44 wherein said target is selected from the group consisting of ligands, receptors, hormones, cytokines, antibodies, antigens, enzymes, enzyme substrates and viruses.

50. The method of claim 44 wherein said peptide sequence further comprises at least one B, said B being a core binding sequence of p amino acid residues in length.

51. The method of claim 44 wherein said peptide sequence further comprises at least one cysteine residue.

52. The method of claim 44 wherein at least one of said X^1, X^2, X^3 and X^4 residues is a cysteine residue.

53. A peptide identified by use of said library of claim 1 which peptide is useful as a diagnostic or therapeutic product in that said peptide is able to bind to a target molecule which is involved in a physiological process.

54. In a library having a collection of nucleic acid molecules encoding peptides having random amino acids, the improvement comprising a library in which the random amino acids are not continuous so that the amino acids in a peptide that are important contacts for interaction between said peptide and a target molecule can be identified.

55. In a library having a collection of nucleic acid molecules encoding peptides having random amino acids, the improvement comprising nucleic acid molecules encoding alanine or glycine or a combination of alanine and glycine residues in varying numbers acting as spacers between the random amino acids so that amino acid residues in a peptide that are important contacts for interaction between said peptide and a target molecule can be identified.

56. A collection of recombinant DNA molecules encoding peptide sequences having a plurality of different binding domains, said peptide sequences comprising:

X^1(Y^1)_{c^1}X^2(Y^2)_{c^2}X^3(Y^3)_{c^3}X^4

wherein each X^1, X^2, X^3 and X^4 is an amino acid residue and any of X^1, X^2, X^3 and X^4 can be the same or different from any one other, wherein each Y^1, Y^2 and Y^3 is alanine or glycine or a combination of alanine and glycine that is respectively c^1, c^2 and c^3 amino acid residues long and any of Y^1, Y^2 and Y^3 if present can be the same or different from any one other, wherein each of c^1, c^2 and c^3 is 0 to about 20, wherein X^1 and X^4 are each attached to an amino acid residue that flanks said peptide sequence, and

wherein at least about 10' to about 10" permutations of all possible permutations of said peptide sequence are present in said collection.

57. The collection of recombinant DNA molecules of claim 56 wherein said collection does not contain more than about 10% of displayed peptide sequences different from said first mentioned displayed peptide sequences.

58. The collection of recombinant DNA molecules of claim 56 wherein said peptide sequences are displayed on the surface of a biological material selected from the group consisting of a virus, phage, cell, spore and gene product.

59. A recombinant filamentous phage having a displayed peptide sequence with known binding properties,

said displayed peptide sequence being foreign to said filamentous phage,

said displayed peptide sequence comprising:

X^1(Y^1)_{c^1}X^2(Y^2)_{c^2}X^3(Y^3)_{c^3}X^4

wherein each X^1, X^2, X^3 and X^4 is an amino acid residue and any of X^1, X^2, X^3 and X^4 can be the same or different from any one other, wherein each Y^1, Y^2 and Y^3 is alanine or glycine or a combination of alanine and glycine that is respectively c^1, c^2 and c^3 amino acid residues long and any of Y^1, Y^2 and Y^3 if present can be the same or different from any one other, wherein each of c^1, c^2 and c^3 is 0 to about 20, wherein at least one of Y^1, Y^2 and Y^3 is at least about one amino acid residue long, wherein X^1 and X^4 are each attached to an amino acid residue that flanks said displayed peptide sequence, and

said displayed peptide sequence being able to bind to a target.

60. A recombinant vector having a nucleic acid sequence inserted in a gene, said nucleic acid sequence encoding a displayed peptide sequence having known binding properties,

said displayed peptide sequence comprising:

X^1(Y^1)_{c^1}X^2(Y^2)_{c^2}X^3(Y^3)_{c^3}X^4

wherein each X^1, X^2, X^3 and X^4 is an amino acid residue and any of X^1, X^2, X^3 and X^4 can be the same or different from any one other, wherein each Y^1, Y^2 and Y^3 is alanine or glycine or a combination of alanine and glycine that is respectively c^1, c^2 and c^3 amino acid residues long and any of Y^1, Y^2 and Y^3 if present can be the same or different from any one other, wherein each of c^1, c^2 and c^3 is 0 to about 20, wherein at least one of Y^1, Y^2 and Y^3 is at least about one amino acid residue long, wherein X^1 and X^4 are each attached to an amino acid residue that flanks said displayed peptide sequence, and

said displayed peptide sequence being able to bind to a target.

61. A recombinant nucleic acid molecule having a nucleic acid sequence inserted in a gene, said nucleic acid sequence encoding a displayed peptide sequence having known binding properties,

said displayed peptide sequence comprising:

X^1(Y^1)_{c^1}X^2(Y^2)_{c^2}X^3(Y^3)_{c^3}X^4

wherein each X^1, X^2, X^3 and X^4 is an amino acid residue and any of X^1, X^2, X^3 and X^4 can be the same or different from any one other, wherein each Y^1, Y^2 and Y^3 is alanine or glycine or a combination of alanine and glycine that is respectively c^1, c^2 and c^3 amino acid residues long and any of Y^1, Y^2 and Y^3 if present can be the same or different from any one other, wherein each of c^1, c^2 and c^3 is 0 to about 20, wherein at least one of Y^1, Y^2 and Y^3 is at least about one amino acid residue long, wherein X^1 and X^4 are each attached to an amino acid residue that flanks said displayed peptide sequence, and

said displayed peptide sequence being able to bind to a target.



62. A recombinant protein having a displayed peptide sequence having known binding properties,

said displayed peptide sequence comprising:

X^1(Y^1)_{c^1}X^2(Y^2)_{c^2}X^3(Y^3)_{c^3}X^4

wherein each X^1, X^2, X^3 and X^4 is an amino acid residue and any of X^1, X^2, X^3 and X^4 can be the same or different from any one other, wherein each Y^1, Y^2 and Y^3 is alanine or glycine or a combination of alanine and glycine that is respectively c^1, c^2 and c^3 amino acid residues long and any of Y^1, Y^2 and Y^3 if present can be the same or different from any one other, wherein each of c^1, c^2 and c^3 is 0 to about 20, wherein at least one of Y^1, Y^2 and Y^3 is at least about one amino acid residue long, wherein X^1 and X^4 are each attached to an amino acid residue that flanks said displayed peptide sequence, and

said displayed peptide sequence being able to bind to a target.

63. An anchor library, comprising:

a collection of recombinant vectors,

each of said recombinant vectors having a nucleic acid sequence inserted in a gene, said nucleic acid sequence encoding a displayed peptide sequence,

said displayed peptide sequence of each of said vectors comprising

X^1(Y^1)_{c^1}X^2(Y^2)_{c^2}X^3(Y^3)_{c^3}X^4

wherein each X^1, X^2, X^3 and X^4 is an amino acid residue and any of X^1, X^2, X^3 and X^4 can be the same or different from any one other, wherein each Y^1, Y^2 and Y^3 is any specified amino acid or combination of specified amino acids that is respectively c^1, c^2 and c^3 amino acid residues long and any of Y^1, Y^2 and Y^3 if present can be the same or different from any one other, wherein each of c^1, c^2 and c^3 is 0 to about 20, wherein X^1 and X^4 are each attached to an amino acid residue that flanks said displayed peptide sequence, and

wherein at least about 10* to about 10' permutations of all possible permutations of said displayed peptide sequence are present in said anchor library.



64. The library of claim 63 wherein said library does not contain more than about 10% of displayed peptide sequences different from said first mentioned displayed peptide sequences.

65. The library of claim 63 wherein each Y^1, Y^2 and Y^3 is alanine or cysteine or a combination of alanine and cysteine.

66. The library of claim 63 wherein each Y^1, Y^2 and Y^3 is glycine or cysteine or a combination of glycine and cysteine.

67. An anchor library, comprising:

a collection of recombinant vectors,

each of said recombinant vectors having a nucleic acid sequence inserted in a gene, said nucleic acid sequence encoding a displayed peptide sequence,

said displayed peptide sequence of each of said vectors comprising

X^1(Y^1)_{c^1}X^2(Y^2)_{c^2}X^3(Y^3)_{c^3}X^4

wherein each X^1, X^2, X^3 and X^4 is an amino acid residue and any of X^1, X^2, X^3 and X^4 can be the same or different from any one other, wherein each Y^1, Y^2 and Y^3 is alanine or glycine or a core binding sequence B of p amino acid residues in length or a combination of alanine and glycine or alanine and B or glycine and B, that is respectively c^1, c^2 and c^3 amino acid residues long and any of Y^1, Y^2 and Y^3 if present can be the same or different from any one other, wherein each of c^1, c^2 and c^3 is 0 to about 20, wherein X^1 and X^4 are each attached to an amino acid residue that flanks said displayed peptide sequence, and

wherein at least about 10* to about 10* permutations of all possible permutations of said displayed peptide sequence are present in said anchor library.

68. The library of claim 67 wherein said library does not contain more than about 10% of displayed peptide sequences different from said first mentioned displayed peptide sequences.

69. An anchor library, comprising:

a collection of recombinant vectors,

each of said recombinant vectors having a nucleic acid sequence inserted in a gene, said nucleic acid sequence encoding a displayed peptide sequence,

said displayed peptide sequence of each of said vectors comprising

Z^1(Y^1)_{c^1}Z^2(Y^2)_{c^2}Z^3(Y^3)_{c^3}Z^4

wherein each Z^1, Z^2, Z^3 and Z^4 is an amino acid residue or a core binding sequence B of p amino acid residues in length and any of Z^1, Z^2, Z^3 and Z^4 can be the same or different from any one other, wherein each Y^1, Y^2 and Y^3 is alanine or glycine or a combination of alanine and glycine that is respectively c^1, c^2 and c^3 amino acid residues long and any of Y^1, Y^2 and Y^3 if present can be the same or different from any one other, wherein each of c^1, c^2 and c^3 is 0 to about 20, wherein Z^1 and Z^4 are each attached to an amino acid residue that flanks said displayed peptide sequence, and

wherein at least about 10^ to about 10^ permutations of all possible permutations of said displayed peptide sequence are present in said anchor library.

70. The library of claim 69 wherein said library does not contain more than about 10% of 
displayed peptide sequences different from said first mentioned displayed peptide sequences. 
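
The displayed-sequence formula recited in the claims above can be illustrated with a short Python sketch. This is an illustration only and forms no part of the claims. It assumes the 20 standard amino acids in one-letter code and alanine/glycine spacers as in claim 41; the function names `anchor_peptide` and `enumerate_library` are hypothetical.

```python
from itertools import product

# Assumption: the 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def anchor_peptide(xs, spacers):
    """Assemble X1(Y1)c1 X2(Y2)c2 X3(Y3)c3 X4 from four residues and three spacers.

    xs      -- four single residues (X1..X4)
    spacers -- three strings (Y1..Y3) of alanine/glycine; each may be empty (c = 0)
    """
    if len(xs) != 4 or len(spacers) != 3:
        raise ValueError("need four X residues and three spacer strings")
    if any(len(x) != 1 or x not in AMINO_ACIDS for x in xs):
        raise ValueError("each X must be a single standard residue")
    if any(set(y) - set("AG") for y in spacers):
        raise ValueError("spacers may contain only alanine (A) or glycine (G)")
    x1, x2, x3, x4 = xs
    y1, y2, y3 = spacers
    return x1 + y1 + x2 + y2 + x3 + y3 + x4


def enumerate_library(spacers, alphabet=AMINO_ACIDS):
    """Yield every displayed sequence for one fixed choice of spacers."""
    for xs in product(alphabet, repeat=4):
        yield anchor_peptide(xs, spacers)


if __name__ == "__main__":
    # X = C, D, E, F with spacers of length 1, 2 and 0.
    print(anchor_peptide("CDEF", ("A", "GG", "")))
```

Passing a reduced `alphabet` keeps the enumeration small enough to inspect by hand; with the full alphabet the generator walks every combination of the four X positions for the chosen spacers.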
A. CLASSIFICATION OF SUBJECT MATTER:
IPC (6):

G01N 33/543; C07K 2/00, 14/00; C12Q 1/68; C12N 15/09, 15/70, 15/74

BOX II. OBSERVATIONS WHERE UNITY OF INVENTION WAS LACKING
This ISA found multiple inventions as follows:

This application contains the following inventions or groups of inventions which are not so linked as to form a single inventive concept under PCT Rule 13.1. In order for all inventions to be examined, the appropriate additional examination fees must be paid.

Group I, claim(s) 1-52 and 63-70, drawn to anchor vector libraries, method of making and using said libraries, classified in class 436/518, and 435/6.

Group II, claim(s) 53, drawn to a peptide, classified in class 530/300+.

Group III, claim(s) 54-58, drawn to a collection of nucleic acid molecules, classified in class 435/6.

Group IV, claims 59-61, drawn to a phage/vector displaying a peptide sequence, classified in class 435/320.1.

Group V, claim 62, drawn to a recombinant protein, classified in class 530/350+.

The inventions listed as Groups I-V do not relate to a single inventive concept under PCT Rule 13.1 because, under PCT Rule 13.2, they lack the same or corresponding special technical features for the following reasons: The only linking feature among the claims is the peptide having the recited sequence. However, in the formula for this peptide sequence as shown, for example, in claim 1, the amino acids represented by "Y" need not be present (because the values of c^1, c^2 or c^3 can be zero). This would give rise to a collection of tetrapeptides without any specific requirements for the amino acids. Such peptides are obviously known in the prior art. Accordingly the linking technical feature is not a "special technical feature" as defined by PCT Rule 13.2, because it fails to make a contribution over the prior art.
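
The observation that zero-length spacers reduce the formula to a tetrapeptide can be checked with simple arithmetic. The following Python sketch is illustrative only. It assumes each of c^1, c^2 and c^3 is an integer from 0 to 20 (treating "about 20" as a hard upper bound) and that each X position is drawn from the 20 standard amino acids; the function names are hypothetical.

```python
def displayed_length(c1, c2, c3):
    """Residues in X1(Y1)c1 X2(Y2)c2 X3(Y3)c3 X4: four X positions plus spacers."""
    for c in (c1, c2, c3):
        # Assumption: "0 to about 20" is read as the integer range 0..20.
        if not 0 <= c <= 20:
            raise ValueError("each c is taken here as an integer from 0 to 20")
    return 4 + c1 + c2 + c3


def x_permutations(alphabet_size=20):
    """Number of ways to fill the four X positions for fixed spacers."""
    return alphabet_size ** 4


if __name__ == "__main__":
    print(displayed_length(0, 0, 0))     # all spacers absent: a tetrapeptide
    print(displayed_length(20, 20, 20))  # longest sequence under the assumed bound
    print(x_permutations())              # fillings of X1..X4 from 20 residues
```

Under these assumptions the all-zero case has length 4, which is the tetrapeptide the Authority refers to, and the four X positions admit 20^4 = 160,000 fillings for any fixed set of spacers.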